Even with the introduction of the ggplot2 package, many R users still rely on base R (i.e. R without any user-installed packages) to create their plots. While the charts produced tend to be less polished than their ggplot2 counterparts, the syntax for base R can be very succinct. This is helpful when you are just exploring the data and are not too concerned about presentation.

For this document, we’ll use the built-in mtcars dataset as a running example.

data(mtcars)
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Scatterplots

Let’s say we are interested in the relationship between mpg and wt. We can make a scatterplot using the plot command, defining the x and y arguments of the function. (Recall that data frames are really just lists, so mtcars$wt refers to the wt element of mtcars, i.e. the values in the wt column.)

plot(x = mtcars$wt, y = mtcars$mpg)
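
The same scatterplot can also be written without repeating mtcars$ for every column. The sketch below shows two alternatives that should give the same picture as the call above: the formula interface of plot and the with function. The variable name dat is only an illustrative choice, used so that the example works on a fresh copy of the data.

```r
# Fresh copy of the built-in dataset (the name `dat` is illustrative only)
dat <- datasets::mtcars

# Formula interface: y ~ x, with the column names looked up in `data`
plot(mpg ~ wt, data = dat)

# Same plot, using with() to evaluate the column names inside the data frame
with(dat, plot(x = wt, y = mpg))
```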

To draw the points with a different shape (the default is an open circle), add the pch argument to plot (see ?points for the full list of shapes):

plot(x = mtcars$wt, y = mtcars$mpg, pch = 5)
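
To see what each shape code looks like, one option is to plot them all at once. The following is a small sketch, assuming you want the numeric codes 0 to 25; the variable name codes and the layout choices (axis limits, label size) are arbitrary.

```r
# Shape codes to display (0 to 25 are the numbered plotting symbols)
codes <- 0:25

# One point per code, all at the same height, each drawn with its own shape
plot(x = codes, y = rep(1, length(codes)), pch = codes,
     ylim = c(0.5, 1.5), xlab = "pch value", ylab = "", yaxt = "n")

# Write the code number above each symbol
text(x = codes, y = 1.2, labels = codes, cex = 0.7)
```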

To change the size of the points, add the cex option to plot (1 is the default value):

plot(x = mtcars$wt, y = mtcars$mpg, pch = 7, cex = 2)

In some cases (e.g. time series data), we may want lines joining the data points instead of showing just the points themselves. To have lines instead of points, add type = "l" to the plot command. (Other options for type are “o”, “p” and “b”. Try them!)

plot(x = mtcars$wt, y = mtcars$mpg, type = "l")

(Drawing lines doesn’t really make sense in this context, so the above is simply for illustration.) We can have different types of lines by adding an lty option to plot (see the lty entry in ?par for more line options):

plot(x = mtcars$wt, y = mtcars$mpg, type = "l", lty = "dashed")

For different line widths, use lwd:

plot(x = mtcars$wt, y = mtcars$mpg, type = "l", lwd = 2)
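
The lines in these plots zigzag because plot joins the points in the order in which they appear in the data frame. If you do want a line that moves from left to right, one approach is to sort by the x variable first. A minimal sketch (the name ord is an arbitrary choice, and type = "b" is used here to show both the points and the lines):

```r
# Row order that sorts the cars by weight, lightest first
ord <- order(mtcars$wt)

# Plot in that order so the line runs from left to right
plot(x = mtcars$wt[ord], y = mtcars$mpg[ord], type = "b",
     lty = "dotted", lwd = 2)
```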

To change the color of the points, use the col option:

plot(x = mtcars$wt, y = mtcars$mpg, pch = 16, col = "blue")

As in ggplot2, we can make the color of a point depend on which category it is in. Let’s say we want to color the points depending on the value of cyl. We first convert cyl to a factor, then pass it to col in the plot call:

mtcars$cyl <- factor(mtcars$cyl)
plot(x = mtcars$wt, y = mtcars$mpg, pch = 16, col = mtcars$cyl)

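To add a legend, follow the plot call with a legend call. The x and y options give the position of the top-left corner of the legend box, in the coordinates of the plot. Because cyl is a factor with three levels, col = mtcars$cyl uses colors 1 to 3 of the current palette, so the legend call passes the same three color numbers. (Use the console to see what levels(mtcars$cyl) returns. Notice that you have to specify col and pch in the legend call as well. What happens if you don’t include them?)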

plot(x = mtcars$wt, y = mtcars$mpg, pch = 16, col = mtcars$cyl)
legend(x = 5, y = 32, legend = levels(mtcars$cyl), col = c(1:3), pch = 16)
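
If you would rather choose the colors yourself than rely on the default palette, one option is a named vector that maps each level of cyl to a color. The sketch below is just one way to do this; the names cyl_f, cyl_cols and point_cols and the three colors are arbitrary choices. It also places the legend with the keyword "topright" instead of x and y coordinates.

```r
# cyl as a factor (this works whether or not mtcars$cyl was already converted)
cyl_f <- factor(mtcars$cyl)

# One color per level of cyl; the names must match the factor levels
cyl_cols <- c("4" = "darkorange", "6" = "steelblue", "8" = "forestgreen")

# Look up the color for each car by its cyl level
point_cols <- cyl_cols[as.character(cyl_f)]

plot(x = mtcars$wt, y = mtcars$mpg, pch = 16, col = point_cols)
legend("topright", legend = names(cyl_cols), col = cyl_cols, pch = 16,
       title = "cyl")
```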

The code below shows how you can add titles and change the axis labels:

plot(x = mtcars$wt, y = mtcars$mpg,
     main = "Miles per gallon vs. Weight", xlab = "Weight", ylab = "mpg")

To change the size of the title and the axis labels, use the cex.main and cex.lab options respectively. (The cex.axis option changes the size of the numbers at the tick marks instead.)
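
As a quick illustration, the sketch below sets three of the size options in one call: cex.main scales the title, cex.lab scales the xlab and ylab text, and cex.axis scales the numbers at the tick marks. The particular values are arbitrary; each is a multiplier relative to the default size.

```r
plot(x = mtcars$wt, y = mtcars$mpg,
     main = "Miles per gallon vs. Weight", xlab = "Weight", ylab = "mpg",
     cex.main = 1.5,  # title 50% larger than the default
     cex.lab = 1.2,   # axis labels ("Weight", "mpg") 20% larger
     cex.axis = 0.8)  # tick-mark numbers 20% smaller
```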

Histograms

A histogram shows how the values of one variable are distributed by counting how many values fall into each bin. To plot a histogram, use the hist command:

hist(mtcars$mpg)

By default, R chooses the number of bins using Sturges’ formula. To suggest a different number of bins, give the breaks option a number:

hist(mtcars$mpg, breaks = 10)

R treats a single number given to breaks as a suggestion only: it rounds the breakpoints to “pretty” values, so the number of bins you get may not match the number you asked for. For exact control, give breaks a vector of breakpoints instead of a single number. For example, the code below bins the values into (10, 12], (12, 14], …, (32, 34]. (Type ?seq to read the documentation for the seq function and figure out what it returns.)

hist(mtcars$mpg, breaks = seq(10, 34, by = 2))
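
To check which bins R actually used, you can call hist with plot = FALSE, which returns the histogram as an object instead of drawing it. The following is a small sketch; the names h_suggested and h_exact are arbitrary.

```r
# Ask for about 10 bins; R treats the number as a suggestion
h_suggested <- hist(mtcars$mpg, breaks = 10, plot = FALSE)

# Supply the breakpoints directly; R uses exactly these
h_exact <- hist(mtcars$mpg, breaks = seq(10, 34, by = 2), plot = FALSE)

h_suggested$breaks      # breakpoints R chose for the suggested number
h_exact$breaks          # breakpoints we supplied
length(h_exact$counts)  # number of bins
h_exact$counts          # number of cars in each bin
```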

Boxplots

To make a boxplot, use boxplot:

boxplot(mtcars$mpg)

To make a boxplot for each category of cyl (the syntax is a little bit like that for facet_wrap and facet_grid in ggplot2):

boxplot(mtcars$mpg ~ mtcars$cyl)

Notice how the numbers on the y-axis are rotated. To make them horizontal, like the numbers on the x-axis, use las = 1:

boxplot(mtcars$mpg ~ mtcars$cyl, las = 1)
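
The formula can also be written with bare column names if you pass the data frame through the data argument. The sketch below does this and adds a title and axis labels; the label text and the name b are illustrative choices. boxplot also returns, invisibly, the statistics it used, which can be handy for checking the plot.

```r
# Same grouped boxplot, with the columns looked up in `data`
b <- boxplot(mpg ~ cyl, data = mtcars, las = 1,
             main = "Miles per gallon by number of cylinders",
             xlab = "Number of cylinders", ylab = "mpg")

b$names  # group labels, one per box
b$n      # number of cars in each group
```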

Bar plots

To make a bar plot showing how many rows there are for each value of cyl, we first count the rows with the table function, then pass the result to the barplot function. (What do you get if you use plot instead of barplot?)

table(mtcars$cyl)
## 
##  4  6  8 
## 11  7 14
barplot(table(mtcars$cyl))
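
Like plot, barplot accepts main, xlab and ylab. The sketch below stores the counts in a variable (cyl_counts is an arbitrary name), labels the plot, and then draws a second bar plot of proportions instead of counts using prop.table.

```r
# Number of cars for each value of cyl
cyl_counts <- table(mtcars$cyl)

# Bar plot of counts with a title and axis labels
barplot(cyl_counts, main = "Cars by number of cylinders",
        xlab = "Number of cylinders", ylab = "Number of cars")

# Bar plot of proportions: each count divided by the total number of cars
barplot(prop.table(cyl_counts), xlab = "Number of cylinders",
        ylab = "Proportion of cars")
```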

Conclusion

Plotting in base R can be very quick, even though the syntax may be harder to interpret and the outputs may look less professional.

Some other resources if you are interested in learning more about plotting in base R: